Predicting protein secondary structure and solvent accessibility with an improved multiple linear regression method.

Authors

  • Sanbo Qin
  • Yun He
  • Xian-Ming Pan
Abstract

We have improved the multiple linear regression (MLR) algorithm for protein secondary structure prediction by combining it with the evolutionary information provided by PSI-BLAST multiple sequence alignments. On the CB513 dataset, the average three-state overall per-residue accuracy, Q3, reached 76.4%, while the segment overlap accuracy, SOV99, reached 73.2%, using a rigorous jackknife procedure and the strictest reduction of the eight-state DSSP definition to three states. This represents an improvement of approximately 5% in overall per-residue accuracy compared with previous work. Relative solvent accessibility prediction also benefited from this combination of methods. The system achieved 77.7% average jackknifed accuracy for two-state prediction based on a 25% relative solvent accessibility threshold, with a Matthews correlation coefficient of 0.548. The improved MLR secondary structure and relative solvent accessibility prediction server is available at http://spg.biosci.tsinghua.edu.cn/.
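The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The 8-to-3 mapping used here (H, G, I to helix; E, B to strand; everything else to coil) is an assumption about what "strictest reduction" means, the choice of "exposed if RSA is strictly greater than 25%" is also an assumption, and all function names are hypothetical.

```python
from math import sqrt

# Assumed 8-to-3 reduction: H, G, I -> H (helix); E, B -> E (strand);
# every other DSSP code (T, S, '-', ...) -> C (coil).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(states):
    """Map a string of eight-state DSSP codes to three states."""
    return "".join(DSSP_TO_3.get(s, "C") for s in states)


def q3(observed, predicted):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def rsa_two_state(rsa_percent, threshold=25.0):
    """Label residues as exposed (True) or buried (False).

    Assumption: a residue is exposed when its RSA is strictly above
    the threshold.
    """
    return [rsa > threshold for rsa in rsa_percent]


def matthews_cc(observed, predicted):
    """Matthews correlation coefficient for two lists of booleans."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    observed = reduce_dssp("HGIEBTS-")
    print(observed, q3(observed, "HHHEECCH"))
    exposed = rsa_two_state([10.0, 30.0, 60.0, 5.0])
    print(matthews_cc(exposed, [False, True, False, False]))
```

Q3 and the two-state accuracy count correct residues only, whereas the Matthews correlation coefficient also accounts for the balance between exposed and buried classes, which is presumably why the abstract reports both.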

Similar resources

Classification Comparison of Prediction of Solvent Accessibility From Protein Sequences

The prediction of residue solvent accessibility from protein sequences has been studied by various methods. The direct comparison of these methods is impossible due to the variety of datasets used and the differences in structure definition. In this paper we choose 5 classification approaches (decision tree (DT), Support Vector Machine (SVM), Bayesian Statistics (BS), Neural Network (NN) and Mu...

Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure.

The present study is an attempt to develop a neural network-based method for predicting the real value of solvent accessibility from the sequence using evolutionary information in the form of multiple sequence alignment. In this method, two feed-forward networks with a single hidden layer have been trained with standard back-propagation as a learning algorithm. The Pearson's correlation coeffic...

CABIOS Invited Review

The problem of predicting protein structure from the sequence remains fundamentally unsolved despite more than three decades of intensive research effort. However, new and promising methods in three-dimensional (3D), 2D and 1D prediction have reopened the field. Mean-force potentials derived from the protein databases can distinguish between correct and incorrect models (3D). Inter-residue conta...

Sequence based prediction of relative solvent accessibility using two-stage support vector regression with confidence values

Predicted relative solvent accessibility (RSA) provides useful information for prediction of binding sites and reconstruction of the 3D structure based on a protein sequence. ... sequences and the number of known structures. Despite several decades of extensive research in tertiary structure prediction, this task is still a big challenge, especially for sequences that do not have a significant seque...

Prediction of structural features and application to outer membrane protein identification

Protein three-dimensional (3D) structures provide insightful information in many fields of biology. One-dimensional properties derived from 3D structures such as secondary structure, residue solvent accessibility, residue depth and backbone torsion angles are helpful to protein function prediction, fold recognition and ab initio folding. Here, we predict various structural features with the ass...

Journal:
  • Proteins

Volume 61, Issue 3

Pages -

Publication date: 2005